# End-to-End Learning
Wavlm Bert Fusion S Emotion Russian Resd
A multimodal fusion model based on WavLM and BERT, suitable for tasks that process speech and text jointly.
Speech Recognition
Transformers

Aniemore
298
3
Control V11p Sd15 Inpaint
OpenRAIL
A ControlNet v1.1 model that adds conditional control to Stable Diffusion for image inpainting tasks.
Image Generation Other
lllyasviel
38.44k
118
Ast Finetuned Audioset 16 16 0.442
BSD-3-Clause
An audio spectrogram transformer fine-tuned on the AudioSet dataset, utilizing a vision transformer architecture to process audio spectrograms, achieving excellent performance in audio classification tasks.
Audio Classification
Transformers

MIT
35
1
Ast Finetuned Audioset 10 10 0.448 V2
BSD-3-Clause
An audio spectrogram transformer fine-tuned on the AudioSet dataset, which converts audio into spectrograms and processes them using a vision transformer, excelling in audio classification tasks.
Audio Classification
Transformers

MIT
2,072
0
Ast Finetuned Audioset 10 10 0.450
BSD-3-Clause
An audio spectrogram transformer fine-tuned on the AudioSet dataset, utilizing ViT architecture for processing audio spectrograms, achieving excellent performance in audio classification tasks.
Audio Classification
Transformers

MIT
109
4
Wav2vec Speech Project
A speech processing model based on the wav2vec architecture; its intended uses and training data are not specified.
Speech Recognition
Transformers

maryam359
16
0
Wav2vec2 Xls R 300m Demo Colab
Apache-2.0
A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the common_voice dataset.
Speech Recognition
Transformers

Mahalakshmi
16
0